Selecting Series from DataFrame
Single Series
# conventional way to import pandas
import pandas as pd
# read a dataset of UFO reports into a DataFrame
ufo = pd.read_table('http://bit.ly/uforeports', sep=',')
# read_csv is equivalent to read_table, except that it assumes a comma separator by default
ufo = pd.read_csv('http://bit.ly/uforeports')
# examine first 5 rows
ufo.head()
 | City | Colors Reported | Shape Reported | State | Time |
---|---|---|---|---|---|
0 | Ithaca | NaN | TRIANGLE | NY | 6/1/1930 22:00 |
1 | Willingboro | NaN | OTHER | NJ | 6/30/1930 20:00 |
2 | Holyoke | NaN | OVAL | CO | 2/15/1931 14:00 |
3 | Abilene | NaN | DISK | KS | 6/1/1931 13:00 |
4 | New York Worlds Fair | NaN | LIGHT | NY | 4/18/1933 19:00 |
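The equivalence between `read_table` with `sep=','` and `read_csv` can be checked without a network connection. The following is a minimal sketch that assumes a small, hypothetical in-memory CSV (`csv_text`) in place of the real UFO file; the variable names are illustrative only.

```python
import io

import pandas as pd

# Hypothetical three-row sample, NOT the real UFO dataset.
csv_text = "City,State\nIthaca,NY\nHolyoke,CO\nAbilene,KS\n"

# read_table needs the separator spelled out; read_csv defaults to a comma.
via_table = pd.read_table(io.StringIO(csv_text), sep=',')
via_csv = pd.read_csv(io.StringIO(csv_text))

# DataFrame.equals compares shape, labels, dtypes, and values.
same = via_table.equals(via_csv)
print(same)
```

Because both calls parse identical text with the same separator, either one is sufficient in practice; `read_csv` is simply the shorter spelling for comma-separated files.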
# select 'City' Series using bracket notation
ufo['City']
0 Ithaca
1 Willingboro
2 Holyoke
3 Abilene
4 New York Worlds Fair
...
18236 Grant Park
18237 Spirit Lake
18238 Eagle River
18239 Eagle River
18240 Ybor
Name: City, Length: 18241, dtype: object
# check the type of the selection: a single column is a pandas Series
type(ufo['City'])
# select 'City' Series using dot (.) notation
ufo.City
0 Ithaca
1 Willingboro
2 Holyoke
3 Abilene
4 New York Worlds Fair
...
18236 Grant Park
18237 Spirit Lake
18238 Eagle River
18239 Eagle River
18240 Ybor
Name: City, Length: 18241, dtype: object
Note
Bracket notation always works, whereas dot notation has limitations:
- Dot notation doesn't work if the Series name contains spaces.
- Dot notation doesn't work if the Series has the same name as a DataFrame method or attribute (like 'head' or 'shape').
- Dot notation can't be used to define the name of a new Series (see below).
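The first two limitations can be seen on a tiny example. This is a sketch on a hypothetical two-row DataFrame (`df`), with column names chosen only to trigger each case; it is not part of the UFO dataset.

```python
import pandas as pd

# Hypothetical frame: one ordinary name, one name with a space,
# and one name that collides with the DataFrame attribute `shape`.
df = pd.DataFrame({
    'City': ['Ithaca', 'Holyoke'],
    'Colors Reported': ['RED', 'GREEN'],
    'shape': ['TRIANGLE', 'OVAL'],
})

# For a valid identifier, both notations return the same Series.
same_series = df['City'].equals(df.City)

# A name containing a space is only reachable with brackets;
# `df.Colors Reported` would be a syntax error.
colors = df['Colors Reported']

# Dot access resolves to the DataFrame attribute (rows, columns),
# while bracket access returns the column named 'shape'.
attr = df.shape
col = df['shape']

print(same_series, attr, list(col))
```

Bracket notation is therefore the safer default in reusable code, since it does not depend on how the columns happen to be named.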
# create a new 'Location' Series (must use bracket notation to define the Series name)
ufo['Location'] = ufo.City + ', ' + ufo.State
ufo.head()
 | City | Colors Reported | Shape Reported | State | Time | Location |
---|---|---|---|---|---|---|
0 | Ithaca | NaN | TRIANGLE | NY | 6/1/1930 22:00 | Ithaca, NY |
1 | Willingboro | NaN | OTHER | NJ | 6/30/1930 20:00 | Willingboro, NJ |
2 | Holyoke | NaN | OVAL | CO | 2/15/1931 14:00 | Holyoke, CO |
3 | Abilene | NaN | DISK | KS | 6/1/1931 13:00 | Abilene, KS |
4 | New York Worlds Fair | NaN | LIGHT | NY | 4/18/1933 19:00 | New York Worlds Fair, NY |
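To illustrate why bracket notation is required when defining a new Series, the sketch below tries both forms on a hypothetical one-row DataFrame (`df`). Assigning through a new attribute name only sets a Python attribute on the object; recent pandas versions are also expected to emit a `UserWarning` in that case, which is captured here so it does not clutter the output.

```python
import warnings

import pandas as pd

# Hypothetical one-row frame, not the real UFO data.
df = pd.DataFrame({'City': ['Ithaca'], 'State': ['NY']})

# Dot assignment to a name that is not yet a column:
# this sets an attribute on the object and does NOT add a column.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    df.Location = df.City + ', ' + df.State

in_columns_after_dot = 'Location' in df.columns

# Bracket assignment creates the column.
df['Location'] = df['City'] + ', ' + df['State']
in_columns_after_bracket = 'Location' in df.columns

print(in_columns_after_dot, in_columns_after_bracket)
```

Reading an existing column with dot notation is fine, as on the right-hand side above; only the creation of a new column needs brackets.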
Multiple Series
# select multiple Series from the DataFrame by passing a list of column names
ufo[['City', 'State', 'Time']]
 | City | State | Time |
---|---|---|---|
0 | Ithaca | NY | 6/1/1930 22:00 |
1 | Willingboro | NJ | 6/30/1930 20:00 |
2 | Holyoke | CO | 2/15/1931 14:00 |
3 | Abilene | KS | 6/1/1931 13:00 |
4 | New York Worlds Fair | NY | 4/18/1933 19:00 |
... | ... | ... | ... |
18236 | Grant Park | IL | 12/31/2000 23:00 |
18237 | Spirit Lake | IA | 12/31/2000 23:00 |
18238 | Eagle River | WI | 12/31/2000 23:45 |
18239 | Eagle River | WI | 12/31/2000 23:45 |
18240 | Ybor | FL | 12/31/2000 23:59 |
18241 rows × 3 columns
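The double brackets above are not special syntax: the inner pair is an ordinary Python list of column names, and the outer pair is the usual selection operator. The sketch below, on a hypothetical two-row DataFrame (`df`), shows that selecting with a list returns a DataFrame, even for a single name, while selecting with a string returns a Series.

```python
import pandas as pd

# Hypothetical two-row frame standing in for the UFO data.
df = pd.DataFrame({
    'City': ['Ithaca', 'Holyoke'],
    'State': ['NY', 'CO'],
    'Time': ['6/1/1930 22:00', '2/15/1931 14:00'],
})

# The list can be built separately and reused.
cols = ['City', 'State']
subset = df[cols]              # DataFrame with two columns

one_col_frame = df[['City']]   # list with one name: still a DataFrame
one_col_series = df['City']    # plain string: a Series

print(type(subset).__name__, type(one_col_frame).__name__, type(one_col_series).__name__)
```

The columns in the result follow the order of the list, so the same mechanism can also be used to reorder columns.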